48 research outputs found

A powerful method for detecting differentially expressed genes from GeneChip arrays that does not require replicates

Author: AK Hein
Anne-Mette K Hein
B Efron
CM Kendziorski
DB Allison
GK Smyth
KK Lin
P Baldi
R Gottardo
RA Irizarry
RC Gentleman
S Richardson
SE Choe
Sylvia Richardson
VG Tusher
Y Benjamini
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Studies of differential expression that use Affymetrix GeneChip arrays are often carried out with a limited number of replicates. Reasons for this include financial considerations and limits on the available amount of RNA for sample preparation. In addition, failed hybridizations are not uncommon leading to a further reduction in the number of replicates available for analysis. Most existing methods for studying differential expression rely on the availability of replicates and the demand for alternative methods that require few or no replicates is high. RESULTS: We describe a statistical procedure for performing differential expression analysis without replicates. The procedure relies on a Bayesian integrated approach (BGX) to the analysis of Affymetrix GeneChips. The BGX method estimates a posterior distribution of expression for each gene and condition, from a simultaneous consideration of the available probe intensities representing the gene in a condition. Importantly, posterior distributions of expression are obtained regardless of the number of replicates available. We exploit these posterior distributions to create ranked gene lists that take into account the estimated expression difference as well as its associated uncertainty. We estimate the proportion of non-differentially expressed genes empirically, allowing an informed choice of cut-off for the ranked gene list, adapting an approach proposed by Efron. We assess the performance of the method, and compare it to those of other methods, on publicly available spike-in data sets, as well as in a proper biological setting. CONCLUSION: The method presented is a powerful tool for extracting information on differential expression from GeneChip expression studies with limited or no replicates
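The idea of ranking genes by both the estimated expression difference and its uncertainty can be illustrated with a minimal sketch. It assumes posterior draws of expression are already available for each gene under two conditions. The function name, the input layout, and the standardized-difference score are illustrative assumptions, not the ranking criterion used by BGX.

```python
from statistics import mean, stdev


def rank_by_posterior_difference(draws_a, draws_b):
    """Rank genes by |mean difference| / sd of difference (hypothetical score).

    draws_a, draws_b: dicts mapping gene name -> list of posterior draws
    (at least two draws per gene, same length in both conditions).
    """
    scores = {}
    for gene in draws_a:
        # Pair up draws from the two conditions and take their differences.
        diffs = [a - b for a, b in zip(draws_a[gene], draws_b[gene])]
        centre = abs(mean(diffs))
        spread = stdev(diffs)
        if spread > 0:
            scores[gene] = centre / spread
        else:
            # No posterior spread: a nonzero difference is treated as certain.
            scores[gene] = float("inf") if centre > 0 else 0.0
    # Largest standardized difference first.
    return sorted(scores, key=scores.get, reverse=True)


draws_a = {"g1": [5.0, 5.2, 4.8], "g2": [3.0, 4.0, 2.0]}
draws_b = {"g1": [3.0, 3.1, 2.9], "g2": [2.5, 2.0, 3.5]}
ranking = rank_by_posterior_difference(draws_a, draws_b)
```

In this toy input, g1 has a large and tightly concentrated difference while g2 has a small and noisy one, so g1 ranks first.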

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A comprehensive evaluation of SAM, the SAM R-package and a simple modification to improve its performance

Author: AJ Rice
B Efron
B Efron
CM Kendziorski
GK Smyth
JG Thomas
M Newton
MA Newton
MK Kerr
O Larsson
P Delmar
S Dudoit
S Zhang
Shunpu Zhang
TR Golub
VG Tusher
W Huber
W Pan
W Pan
X Guo
Y Xie
Y Zhao
Publication venue: BioMed Central
Publication date: 01/06/2007

Abstract Background The Significance Analysis of Microarrays (SAM) is a popular method for detecting significantly expressed genes and controlling the false discovery rate (FDR). Recently, it has been reported in the literature that the FDR is not well controlled by SAM. Due to the vast application of SAM in microarray data analysis, it is of great importance to have an extensive evaluation of SAM and its associated R-package (sam2.20). Results Our study has identified several discrepancies between SAM and sam2.20. One major difference is that SAM and sam2.20 use different methods for estimating FDR. Such discrepancies may cause confusion among the researchers who are using SAM or are developing the SAM-like methods. We have also shown that SAM provides no meaningful estimates of FDR and this problem has been corrected in sam2.20 by using a different formula for estimating FDR. However, we have found that, even with the improvement sam2.20 has made over SAM, sam2.20 may still produce erroneous and even conflicting results under certain situations. Using an example, we show that the problem of sam2.20 is caused by its use of asymmetric cutoffs which are due to the large variability of null scores at both ends of the order statistics. An obvious approach without the complication of the order statistics is the conventional symmetric cutoff method. For this reason, we have carried out extensive simulations to compare the performance of sam2.20 and the symmetric cutoff method. Finally, a simple modification is proposed to improve the FDR estimation of sam2.20 and the symmetric cutoff method. Conclusion Our study shows that the most serious drawback of SAM is its poor estimation of FDR. Although this drawback has been corrected in sam2.20, the control of FDR by sam2.20 is still not satisfactory. 
The comparison between sam2.20 and the symmetric cutoff method reveals that the performance of sam2.20 relative to the symmetric cutoff method depends on the ratio of induced to repressed genes in a microarray data set, and is also affected by the ratio of differentially expressed (DE) to equally expressed (EE) genes and the distributions of induced and repressed genes. Numerical simulations show that the symmetric cutoff method has the biggest advantage over sam2.20 when there are equal numbers of induced and repressed genes (i.e., the ratio of induced to repressed genes is 1). As the ratio of induced to repressed genes moves away from 1, the advantage of the symmetric cutoff method over sam2.20 gradually diminishes, until eventually sam2.20 becomes significantly better than the symmetric cutoff method when the DE genes are either all induced or all repressed. Simulation results also show that our proposed simple modification provides improved control of FDR for both sam2.20 and the symmetric cutoff method.
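A symmetric cutoff calls a gene significant when the absolute value of its score exceeds a single threshold. The sketch below shows one common permutation-based FDR estimate under that rule. The function name, the argument layout, and the default `pi0=1.0` (proportion of non-DE genes) are assumptions made for illustration, and it is neither the formula used by sam2.20 nor the modification the paper proposes.

```python
def symmetric_cutoff_fdr(observed, null_scores, cutoff, pi0=1.0):
    """Estimate FDR for the rule |score| >= cutoff (illustrative sketch).

    observed: list of observed gene scores.
    null_scores: list of lists, one list of null scores per permutation.
    pi0: assumed proportion of non-differentially expressed genes.
    """
    # Number of genes called significant with the symmetric cutoff.
    called = sum(1 for d in observed if abs(d) >= cutoff)
    if called == 0:
        return 0.0
    # Average number of null scores beyond the cutoff across permutations.
    false_per_perm = [
        sum(1 for d in perm if abs(d) >= cutoff) for perm in null_scores
    ]
    expected_false = sum(false_per_perm) / len(false_per_perm)
    # Cap the estimate at 1, since an FDR cannot exceed 1.
    return min(1.0, pi0 * expected_false / called)


observed = [3.0, -2.5, 0.1, 0.4, -0.2]
null_scores = [
    [0.5, -2.1, 0.3, 0.0, 1.0],
    [0.2, 0.1, -0.4, 0.9, -0.3],
]
fdr = symmetric_cutoff_fdr(observed, null_scores, cutoff=2.0)
```

Here two genes are called and the permutations yield on average 0.5 false calls, so the estimate is 0.25.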

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Use of DNA–Damaging Agents and RNA Pooling to Assess Expression Profiles Associated with BRCA1 and BRCA2 Mutation Status in Familial Breast Cancer Patients

Author: Amanda B. Spurdle
B Weigelt
Barbara E. Stranger
Bryony A. Thompson
C Guillouf
C Kendziorski
CM Kendziorski
CM Perou
CR Correa
D Agrawal
DA Smirnov
EL Korn
EL Korn
G Finak
GJ Mann
I Hedenfalk
I Hedenfalk
JH Shih
K Arnold
K Manju
kConFab Investigators
L Melchor
LC Walker
Logan C. Walker
M Warren
MD Radmacher
N Waddell
N Waddell
N Waddell
NG Howlett
Nic Waddell
OA Stefansson
P Smith
PA Lachenbruch
R Tibshirani
S Dudoit
S Ramaswamy
SA Joosse
Sean M. Grimmond
T Sorlie
T Sorlie
VG Cheung
W Enard
W Wang
W Zhang
X Peng
Z Kote-Jarai
Z Kote-Jarai
Publication venue: Public Library of Science
Publication date: 01/02/2010

A large number of rare sequence variants of unknown clinical significance have been identified in the breast cancer susceptibility genes, BRCA1 and BRCA2. Laboratory-based methods that can distinguish between carriers of pathogenic mutations and non-carriers are likely to have utility for the classification of these sequence variants. To identify predictors of pathogenic mutation status in familial breast cancer patients, we explored the use of gene expression arrays to assess the effect of two DNA–damaging agents (irradiation and mitomycin C) on cellular response in relation to BRCA1 and BRCA2 mutation status. A range of regimes was used to treat 27 lymphoblastoid cell-lines (LCLs) derived from affected women in high-risk breast cancer families (nine BRCA1, nine BRCA2, and nine non-BRCA1/2 or BRCAX individuals) and nine LCLs from healthy individuals. Using an RNA–pooling strategy, we found that treating LCLs with 1.2 µM mitomycin C and measuring the gene expression profiles 1 hour post-treatment had the greatest potential to discriminate BRCA1, BRCA2, and BRCAX mutation status. A classifier was built using the expression profile of nine QRT–PCR validated genes that were associated with BRCA1, BRCA2, and BRCAX status in RNA pools. These nine genes could distinguish BRCA1 from BRCA2 carriers with 83% accuracy in individual samples, but three-way analysis for BRCA1, BRCA2, and BRCAX had a maximum of 59% prediction accuracy. Our results suggest that, compared to BRCA1 and BRCA2 mutation carriers, non-BRCA1/2 (BRCAX) individuals are genetically heterogeneous. This study also demonstrates the effectiveness of RNA pools to compare the expression profiles of cell-lines from BRCA1, BRCA2, and BRCAX cases after treatment with irradiation and mitomycin C as a method to prioritize treatment regimes for detailed downstream expression analysis

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

University of Melbourne Institutional Repository

University of Queensland eSpace

Probe-level linear model fitting and mixture modeling results in high accuracy detection of differential gene expression

Author: AP Dempster
B Efron
BM Bolstad
C Fraley
C Kendziorski
CM Bishop
GJ McLachlan
GK Smyth
J Neter
JH Shih
L Barrera
MK Kerr
P Baldi
RA Irizarry
RA Jolly
RM Simon
S Dudoit
S Hochreiter
S Mukherjee
SD Zhang
SE Choe
Sébastien Lemieux
VG Tusher
WS Cleveland
Y Benjamini
Z Jia
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: The identification of differentially expressed genes (DEGs) from Affymetrix GeneChips arrays is currently done by first computing expression levels from the low-level probe intensities, then deriving significance by comparing these expression levels between conditions. The proposed PL-LM (Probe-Level Linear Model) method implements a linear model applied on the probe-level data to directly estimate the treatment effect. A finite mixture of Gaussian components is then used to identify DEGs using the coefficients estimated by the linear model. This approach can readily be applied to experimental design with or without replication. RESULTS: On a wholly defined dataset, the PL-LM method was able to identify 75% of the differentially expressed genes within 10% of false positives. This accuracy was achieved both using the three replicates per conditions available in the dataset and using only one replicate per condition. CONCLUSION: The method achieves, on this dataset, a higher accuracy than the best set of tools identified by the authors of the dataset, and does so using only one replicate per condition

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Dépôt Institutionnel Numérique

Peroxiredoxin 2: a potential biomarker for early diagnosis of Hepatitis B Virus related liver fibrosis identified by proteomic analysis of the plasma

Author: A Xu
B Hofmann
BJ Kim
C Kendziorski
C Srisomsap
Chengzhao Lin
CM Kendziorski
CS Klade
CT Wai
E Janig
F Oberti
FP Wayne
Fuchu He
G Chen
G Meneses-Lorente
H Wang
Haijian Wang
HY Tang
HZ Chae
I El-Gindy
I Shimizu
I Shimizu
J Chen
J Guechot
J Guéchot
J Kim
JD Wulfkuhle
Jie Liu
Jiyao Wang
JM Moreira
JO Schorge
KL Meehan
LA Echan
LY Zhang
M Unlu
MF Willard
MJ Arthur
NL Anderson
Pengyuan Yang
PH O'Farrell
QY He
RW Putnam
S Lv
VJ Desmet
VN dos Santos
W Zhang
WH Heijne
X Li
X Peng
Ye Lu
Ying Jiang
Publication venue: BioMed Central
Publication date: 01/01/2010

Crossref

Springer - Publisher Connector

PubMed Central

EzArray: A web-based highly automated Affymetrix expression array data management and analysis system

Author: A Brazma
BM Bolstad
C Li
C Romualdi
CM Kendziorski
D Rajagopalan
E Hubbell
GK Smyth
H Rehrauer
HM Hsueh
J Rainer
JM Vaquerizas
JM Wettenhall
K Hokamp
L Jones
M Kapushesky
M Psarros
MA Newton
O Larsson
R Diaz-Uriarte
R Edgar
R Ihaka
RA Irizarry
RA Irizarry
S Dudoit
S Vardhanabhuti
S Zhang
VG Tusher
Wei Xu
WK Lim
WM Liu
X Xia
Y Barash
Yuelin Zhu
Yuerong Zhu
Publication venue: BioMed Central
Publication date: 01/01/2008

Abstract Background Though microarray experiments are very popular in life science research, managing and analyzing microarray data are still challenging tasks for many biologists. Most microarray programs require users to have sophisticated knowledge of mathematics, statistics and computer skills for usage. With accumulating microarray data deposited in public databases, easy-to-use programs to re-analyze previously published microarray data are in high demand. Results EzArray is a web-based Affymetrix expression array data management and analysis system for researchers who need to organize microarray data efficiently and get data analyzed instantly. EzArray organizes microarray data into projects that can be analyzed online with predefined or custom procedures. EzArray performs data preprocessing and detection of differentially expressed genes with statistical methods. All analysis procedures are optimized and highly automated so that even novice users with limited pre-knowledge of microarray data analysis can complete initial analysis quickly. Since all input files, analysis parameters, and executed scripts can be downloaded, EzArray provides maximum reproducibility for each analysis. In addition, EzArray integrates with Gene Expression Omnibus (GEO) and allows instantaneous re-analysis of published array data. Conclusion EzArray is a novel Affymetrix expression array data analysis and sharing system. EzArray provides easy-to-use tools for re-analyzing published microarray data and will help both novice and experienced users perform initial analysis of their microarray data from the location of data storage. We believe EzArray will be a useful system for facilities with microarray services and laboratories with multiple members involved in microarray data analysis. EzArray is freely available from http://www.ezarray.com/.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Multivariate hierarchical Bayesian model for differential gene expression analysis in microarray experiments

Author: A Lewin
B Efron
BP Durbin
CM Kendziorski
D Amaratunga
D Shalon
DA Notterman
Hong Yan
Hongya Zhao
JD Storey
JG Ibrahim
K Lo
Kwok-Leung Chan
Lee-Ming Cheng
M Schena
M Schena
MA Harris
MA Newton
MA Newton
MA Sartor
N Dean
P Baldi
P Broet
P Sham
PM Lee
PO Brown
R Delongchamp
R Gottardo
S Dudoit
S Wang
SOM Manda
VG Tusher
W Huber
Y Benjamini
Y Chen
YH Yang
YH Yang
YH Yang
Publication venue: BioMed Central
Publication date: 13/02/2008

Crossref

Springer - Publisher Connector

PubMed Central

A Bayesian Partition Method for Detecting Pleiotropic and Epistatic eQTL Modules

Author: A Colman-Lerner
A Manichaikul
AC Cervino
AH Enyenihi
BM Bolstad
C Jiang
CJ Geyer
CM Kendziorski
CY Wu
D Mangin
EE Schadt
EE Schadt
EE Schadt
Eric E. Schadt
ES Lander
G Yvert
Gary D. Stormo
J Ronald
J Zhu
JD Storey
JS Liu
Jun S. Liu
Jun Zhu
KD MacIsaac
M Morley
N Yi
PJ Green
RB Brem
RB Brem
RB Brem
SI Lee
TR Hughes
V Emilsson
W Zou
Wei Zhang
Y Chen
Publication venue: Public Library of Science
Publication date: 01/01/2010

Studies of the relationship between DNA variation and gene expression variation, often referred to as “expression quantitative trait loci (eQTL) mapping”, have been conducted in many species and resulted in many significant findings. Because of the large number of genes and genetic markers in such analyses, it is extremely challenging to discover how a small number of eQTLs interact with each other to affect mRNA expression levels for a set of co-regulated genes. We present a Bayesian method to facilitate the task, in which co-expressed genes mapped to a common set of markers are treated as a module characterized by latent indicator variables. A Markov chain Monte Carlo algorithm is designed to search simultaneously for the module genes and their linked markers. We show by simulations that this method is more powerful for detecting true eQTLs and their target genes than traditional QTL mapping methods. We applied the procedure to a data set consisting of gene expression and genotypes for 112 segregants of S. cerevisiae. Our method identified modules containing genes mapped to previously reported eQTL hot spots, and dissected these large eQTL hot spots into several modules corresponding to possibly different biological functions or primary and secondary responses to regulatory perturbations. In addition, we identified nine modules associated with pairs of eQTLs, of which two have been previously reported. We demonstrated that one of the novel modules containing many daughter-cell expressed genes is regulated by AMN1 and BPH1. In conclusion, the Bayesian partition method which simultaneously considers all traits and all markers is more powerful for detecting both pleiotropic and epistatic effects based on both simulated and empirical data

CiteSeerX

Public Library of Science (PLOS)

Crossref

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

Probabilistic Inference for Nucleosome Positioning with MNase-Based or Sonicated Short-Read Data

Author: A Barski
A Longo
A Weiner
AD Goldberg
AG Robertson
BE Bernstein
BG Hoffman
Brad G. Hoffman
C Jiang
CM Kendziorski
DJ Clark
E Tong
Francisco José Esteban
G Li
G Robertson
Gordon Robertson
H Kim
HH He
I Albert
J Besag
J Ernst
J Rozowsky
K Lo
LA Cirillo
M Radman-Livaja
ML Conerly
MS Ong
N Kaplan
ND Heintzman
PF Kuan
Raphael Gottardo
RS Edayathumangalam
S Heinz
S Roy
Sangsoon Woo
SJ Elsaesser
U Munzel
X Zhang
Xuekui Zhang
Y Zhang
Z Zhang
Publication venue: Public Library of Science
Publication date: 01/01/2012

We describe a model-based method, PING, for predicting nucleosome positions in MNase-Seq and MNase- or sonicated-ChIP-Seq data. PING compares favorably to NPS and TemplateFilter in scalability, accuracy and robustness to low read density. To demonstrate that PING predictions from widely available sonicated data can have sufficient spatial resolution to be useful for biological inference, we use Illumina H3K4me1 ChIP-seq data to detect changes in nucleosome positioning around transcription factor binding sites due to tamoxifen stimulation, to discriminate functional and non-functional transcription factor binding sites more effectively than with enrichment profiles, and to confirm that the pioneer transcription factor Foxa2 associates with the accessible major groove of nucleosomal DNA

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Advanced Computational Biology Methods Identify Molecular Switches for Malignancy in an EGF Mouse Model of Liver Cancer

Author: A Hosui
A Kel
A Kel
A Sala
A Seth
AE Kel
AE Kel
Alexander Kel
AM Waterhouse
AP Feinberg
C Desbois-Mouthon
C Yang
CD Schmid
CM Kendziorski
CM Shea
DL Galson
E Wingender
EC Lopes
Edgar Wingender
FM van Roy
G Otaegi
H Michael
HE Jones
J Borlak
J Jiang
J Riedemann
JA Figueroa
JJ Shah
Juergen Borlak
K Hayashida
K Katoh
KH Ventii
LC Yeh
M Ashburner
M Krull
M Mietus-Snyder
MH Tai
Michael Polymenis
Nico Voss
P Carninci
P Nioi
Philip Stegmaier
R Yamashita
RC Gentleman
S Morin
S Rahmann
S Zhang
T Pham-Gia
Tatiana Meier
TJP Hubbard
TM DeChiara
V Matys
VX Fu
Y Babaie
Y Fu
Y Guo
Publication venue: Public Library of Science
Publication date: 01/01/2011

The molecular causes by which the epidermal growth factor receptor tyrosine kinase induces malignant transformation are largely unknown. To better understand EGF's transforming capacity, whole genome scans were applied to a transgenic mouse model of liver cancer and subjected to advanced methods of computational analysis to construct de novo gene regulatory networks based on a combination of sequence analysis and entrained graph-topological algorithms. Here we identified transcription factors, processes, key nodes and molecules to connect as yet unknown interacting partners at the level of protein-DNA interaction. Many of those could be confirmed by electromobility band shift assay at recognition sites of gene specific promoters and by western blotting of nuclear proteins. A novel cellular regulatory circuitry could therefore be proposed that connects cell cycle regulated genes with components of the EGF signaling pathway. Promoter analysis of differentially expressed genes suggested the majority of regulated transcription factors to display specificity to either the pre-tumor or the tumor state. Subsequent search for signal transduction key nodes upstream of the identified transcription factors and their targets suggested the insulin-like growth factor pathway to render the tumor cells independent of EGF receptor activity. Notably, expression of IGF2 in addition to many components of this pathway was highly upregulated in tumors. Together, we propose a switch in autocrine signaling to foster tumor growth that was initially triggered by EGF and demonstrate the knowledge gain from promoter analysis combined with upstream key node identification

Crossref

Fraunhofer-ePrints

Directory of Open Access Journals

PubMed Central